THE APPLICATIONS OF PROBABILITY TO CRYPTOGRAPHY


The theory of probability may be used in cryptography
with most effect when the type of cipher used is already
fully understood, and it only remains to find the actual
keys. It is of rather less value when one is trying to
diagnose the type of cipher, but if definite rival theories
about the type of cipher are suggested it may be used to
decide between them.

Meaning of probability and odds.

I shall not attempt to give a systematic account of the
theory of probability, but it may be worth while to define
shortly 'probability' and 'odds'. The probability of an
event on certain evidence is the proportion of cases in
which that event may be expected to happen given that evidence.
For instance if it is known that 20% of men live to the age
of 70, then knowing of Hitler only 'Hitler is a man' we can
say that the probability of Hitler living to the age of 70
is 0.2. Suppose however that we know that 'Hitler is now of
age 52'; the probability will be quite different, say 0.5,
because 50% of men of 52 live to 70.

The 'odds' of an event happening is the ratio p/(1 - p)
where p is the probability of it happening. This terminology
is connected with the common phraseology 'odds of 5:2 on'
meaning in our terminology that the odds are 5/2.
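The relation between probability and odds can be sketched in a few lines. This is only an illustration of the definition just given; the function names are hypothetical and exact fractions are used so that no rounding intrudes.

```python
from fractions import Fraction

def odds_from_probability(p):
    """Return the odds p / (1 - p) for a probability p strictly below 1."""
    p = Fraction(p)
    return p / (1 - p)

def probability_from_odds(odds):
    """Invert the relation: p = odds / (1 + odds)."""
    odds = Fraction(odds)
    return odds / (1 + odds)

# A probability of 0.2 (a man living to 70) corresponds to odds of 1/4.
odds_of_living_to_70 = odds_from_probability(Fraction(1, 5))

# 'Odds of 5:2 on' are odds of 5/2, i.e. a probability of 5/7.
probability_at_five_to_two_on = probability_from_odds(Fraction(5, 2))
```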



Probabilities based on part of the evidence


When the whole evidence about some event is taken into
account it may be extremely difficult to estimate the
probability of the event, even very approximately, and it
may be better to form an estimate based on a part of the evidence,
so that the probability may be more easily calculated. This
happens in cryptography in a very obvious way. The whole evidence
when we are trying to solve a cipher is the complete traffic,
and the events in question are the different possible keys, and
functions of the keys. Unless the traffic is very small indeed
the theoretical answer to the problem 'What are the probabilities
of the various keys?' will be of the form 'The key ... has
a probability differing almost imperceptibly from 1 (certainty)
and the other keys are virtually impossible'. But a direct
attempt to determine these probabilities would obviously not be
a practical method.

A priori probabilities 

The evidence concerning the possibility of an event occurring
usually divides into a part about which statistics are available,
or some mathematical method can be applied, and a less definite
part about which one can only use one's judgment. Suppose for
example that a new kind of traffic has turned up and that
only three messages are available. Each message has the letter V
in the 17th place and G in the 18th place. We want to know the
probability that it is a general rule that we should find V
and G in these places. We first have to decide how probable it
is that a cipher would have such a rule, and as regards this one
can probably only guess, and my guess would be about 1/5,000,000.
This judgment is not entirely a guess; some rather inaccurate
mathematical reasoning has gone into it, something like this:

The chance of there being a rule that two consecutive letters
somewhere after the 10th should have certain fixed values seems
to be about 1/500 (this is a complete guess). The chance of the
letters being the 17th and 18th is about 1/15 (another guess,
but not quite so much in the air). The probability of the letters
being V and G is 1/676 (hardly a guess at all, but expressing a
judgment that there is no special virtue in the bigramme VG).

Hence the chance is 1/(500 × 15 × 676) or about 1/5,000,000. This
is however all so vague, that it is more usual to make the
judgment '1/5,000,000' without explanation.

The question as to what is the chance of having a rule of
this kind might of course be solved by statistics of some
kind, but there is no point in having this very accurate, and
of course the experience of the cryptographer itself forms
a kind of statistics.

The remainder of the problem is then solved quite mathematically. 

Let us consider a large number of ciphers chosen at random, N
of them say. Of these N/5,000,000 will have the rule in
question, and the remainder not. Now if we had three messages
for each of the ciphers before us, we should find that for each
of the ciphers with the rule, the three messages have VG in the
required place, but of the remaining 4,999,999 N/5,000,000
only a proportion 1/676^3 will have them. Rejecting the ciphers
which have not the required characteristic we are left with
N/5,000,000 cases where the rule holds, and
4,999,999 N/(5,000,000 × 676^3) cases where it does not. This
selection of ciphers is a random selection of those with the
known characteristics of the one in question, and therefore the
odds in favour of the rule holding are
N/5,000,000 : 4,999,999 N/(5,000,000 × 676^3), i.e.
676^3 : 4,999,999 or about 60:1 on.

It should be noticed that the whole argument is to some
extent fallacious, as it is assumed that there are only two
possibilities, viz. that either VG must always occur in that
position, or else that the letters in the 17th and 18th
positions are wholly random. There are however many other
possibilities worth consideration, e.g.

On the day in question we have VG in the position in question.
On another day we have some other fixed pair of letters. Or

In the positions 17, 18 we have to have one of the four
combinations VG, RE, OM, IL and by chance VG has been chosen
for all the three messages we have had. Or

The cipher is a simple substitution and VG is the substitute
of some common bigramme, say TH.

The possibilities are of course endless, and it is therefore
always necessary to bear in mind the possibility of there being
other theories not yet suggested.

The a priori probability sometimes has to be estimated as
above by some sort of guesswork, but often the situation is more
satisfactory. Suppose for example that we know that a certain
cipher is a simple substitution, the keys having no specially
noticeable properties. Suppose also that we have 550 letters
of such a message including five occurrences of P. We want to
know how probable it is that P is the substitute of E. As before
we have to answer two questions. First: how likely is it that P
would be the substitute of E, neglecting the evidence of the five
Ps occurring in the message? Secondly: how likely are we to get
5 Ps (a) if P is not the substitute of E, (b) if P is the
substitute of E? I will not attempt to answer the second question
for the present.

The answer to the first is simply that the probability of any
letter being the substitute of E is independent of what the letter
is, and is therefore always 1/26; in particular it is 1/26
for the letter P. The only guesswork here is the judgment that
the keys are chosen at random.
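The arithmetic of the VG example earlier in this section can be checked with a short sketch. The function name and its parameters are hypothetical; the only inputs are the three guesses 1/500, 1/15 and 1/676 and the rounded prior 1/5,000,000 used in the text, and the sketch shares the text's simplifying assumption that the only alternative to the rule is wholly random letters.

```python
from fractions import Fraction

# A priori probability of the rule: the three guesses multiplied together.
prior_exact = Fraction(1, 500) * Fraction(1, 15) * Fraction(1, 676)  # 1/5,070,000
prior_rounded = Fraction(1, 5_000_000)  # the round figure used in the text

def posterior_odds(prior, messages=3, bigram_chance=Fraction(1, 676)):
    """Odds in favour of the rule after `messages` messages all show VG."""
    prior_odds = prior / (1 - prior)
    # With the rule VG is certain; without it each message shows VG
    # with chance 1/676, independently of the others.
    factor = 1 / bigram_chance ** messages
    return prior_odds * factor

odds = posterior_odds(prior_rounded)
print(odds, float(odds))  # 676**3 / 4,999,999, a little over 60
```

Using the unrounded prior 1/5,070,000 instead changes the result only slightly, which is one reason the rough guess is good enough here.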

The Factor Principle.

Nearly all applications of probability to cryptography
depend on the 'factor principle' (or Bayes' theorem). This
principle may first be illustrated by a simple example. Suppose
that one man in five dies of heart failure, and that of men who
die of heart failure two in three die in their beds, but of men
who die from other causes only one in four die in their beds.
(My facts are no doubt hopelessly inaccurate). Now suppose we
know that a certain man died in his bed. What is the probability
that he died of heart failure? Of all men, numbering N say, we
find that

N × (1/5) × (2/3) die in their beds of heart failure
N × (1/5) × (1/3) die of heart failure elsewhere
N × (4/5) × (1/4) die in their beds from other causes
N × (4/5) × (3/4) die from other causes elsewhere

Now as our man died in his bed we do not need to consider
the cases of men who did not die in their beds, and the
remaining cases consist of N × (1/5) × (2/3) cases of heart
failure and N × (4/5) × (1/4) from other causes, and therefore
the odds are 1 × (2/3) : 4 × (1/4) in favour of heart failure.
If this had been done algebraically the result would have been
A posteriori odds of the theory

  = A priori odds of the theory

    × (Probability of the data being fulfilled if the theory is true)
      / (Probability of the data being fulfilled if the theory is false)

In this case the 'theory' is that the man died of heart failure,
and the 'data' is that he died in his bed. The general formula
above will be described as the 'factor principle'; the ratio

  (Probability of the data if theory true)
    / (Probability of the data if theory false)

is called the factor for the theory on account of the data.
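The heart failure example can be worked through as a small sketch of the factor principle. The function names are hypothetical; the figures are the ones given in the text.

```python
from fractions import Fraction

def factor(p_data_if_true, p_data_if_false):
    """Factor for the theory on account of the data."""
    return Fraction(p_data_if_true) / Fraction(p_data_if_false)

def posterior_odds(prior_odds, *factors):
    """A posteriori odds = a priori odds times each factor in turn."""
    result = Fraction(prior_odds)
    for f in factors:
        result *= f
    return result

# One man in five dies of heart failure: a priori odds (1/5)/(4/5) = 1/4.
prior = Fraction(1, 5) / Fraction(4, 5)

# Dying in bed: 2/3 of heart failure cases, 1/4 of the others.
bed_factor = factor(Fraction(2, 3), Fraction(1, 4))

odds = posterior_odds(prior, bed_factor)   # 2/3, i.e. 1 x (2/3) : 4 x (1/4)
probability = odds / (1 + odds)            # 2/5
```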


Decibanage.

Usually when we are estimating the probability of a theory
there will be several independent pieces of evidence, e.g. following
our last example, where we want to know whether a certain man
died of heart failure or not, we may know
a) He died in his bed
b) His father died of heart failure
c) His bedroom was on the ground floor
and also have statistics telling us

2/3 of men who die of heart failure die in their beds
2/5 of men who die of heart failure have fathers who died of
    heart failure
1/2 of men who die of heart failure have bedrooms on the
    ground floor
1/4 of men who die from other causes die in their beds
1/6 of men who die from other causes have fathers who died
    of heart failure
1/20 of men who die from other causes have their bedrooms on
    the ground floor

Let us suppose that the three pieces of evidence are independent
of one another if we know that he died of heart failure, and
also if we know he did not die of heart failure. That is to say
we suppose for instance that knowing that he slept on the
ground floor does not make it any more likely that he died
in his bed if we knew all along that he died of heart failure.
When we make these assumptions the probability of a man who
died of heart failure satisfying all three conditions is
obtained simply by multiplication, and is (2/3) × (2/5) × (1/2),
and likewise for those who died from other causes the
probability is (1/4) × (1/6) × (1/20), and the factor in favour
of the heart failure theory is

  ((2/3) × (2/5) × (1/2)) / ((1/4) × (1/6) × (1/20))


We may regard this as the product of the three factors
(2/3)/(1/4) and (2/5)/(1/6) and (1/2)/(1/20) arising from the
three independent pieces of evidence. Products like this arise
very frequently, and sometimes one will get products
involving thousands of factors, and large groups of these
factors may be equal. We naturally therefore work in terms of
the logarithms of the factors. The logarithm of the factor,
taken to the base 10^(1/10), is called the 'decibanage' in favour
of the theory. A 'deciban' is a unit of evidence; a
piece of evidence is worth a deciban if it increases the
odds of the theory in the ratio 10^(1/10) : 1. The deciban is
used as a more convenient unit than the 'ban'. The terminology
was introduced in honour of the famous town of Banbury.




Using this terminology we might say that the fact that our man
died in bed scores 4.3 decibans in favour of the heart failure
theory (10 log(8/3) = 4.3). We score a further 3.8 decibans for
his father dying of heart failure, and 10 for his having his
bedroom on the ground floor, totalling 18.1 decibans. We then
bring in the a priori odds 1/4 or 10^(-6/10) and the result is
that the odds are 10^(12.1/10), or as we may say '12.1 decibans
up on evens'. This means about 16:1 on.
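The decibanage bookkeeping can be sketched as follows. The function name `decibans` and the dictionary labels are hypothetical; the factors are the three quoted in the text. Working with unrounded logarithms gives a total of about 12.0 rather than 12.1 decibans, the small difference coming from rounding each score to one decimal place before adding.

```python
import math

def decibans(factor):
    """Decibanage of a factor: ten times its logarithm to base 10."""
    return 10 * math.log10(factor)

factors = {
    "died in bed": (2 / 3) / (1 / 4),
    "father died of heart failure": (2 / 5) / (1 / 6),
    "bedroom on ground floor": (1 / 2) / (1 / 20),
}
scores = {name: decibans(f) for name, f in factors.items()}

evidence_total = sum(scores.values())     # about 18.1 decibans
prior_score = decibans(1 / 4)             # about -6 decibans
posterior_score = evidence_total + prior_score
posterior_odds = 10 ** (posterior_score / 10)   # 16, i.e. 16:1 on

for name, score in scores.items():
    print(f"{name}: {score:.1f} decibans")
print(f"total {posterior_score:.1f} decibans up on evens")
```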


Chapter II. Straightforward cryptographic problems.


Vigenère.


The factor principle can be applied to the solution of a
Vigenère problem with great effect. I will assume here that
the period of the cipher has already been determined. Probability
theory may be applied to this part of the problem also, but
that is not so elementary. Suppose our cipher, written out
in its correct period, is


DK^H-SHZNH?
HC VXUHTiSA Q,
XHPUEPP3BK
TrtUJAGDYOJ
THWCYDZHGA
PZKOXOEYAE
BOICBUBP I 5 R


(It is only by chance that it makes a rectangular array).

Let us try to find the key for the first column, and for the
moment let us only take into account the evidence afforded
by the first letter D. Let us first consider the key B. The
factor principle tells us

Odds in favour of key B = A priori odds in favour of key B

    × (Probability of getting D in cipher if key is B)
      / (Probability of getting D in cipher if key is not B)

Now the a priori odds in favour of key B may be taken as 1/25.
The probability of getting D in the cipher with the key B is
just the probability of getting C in the clear, which (using the
count on 1000 letters in Fig 2) is 0.021. If however the key
is not B we can have any letter other than C in the clear, and
the probability is (1 - 0.021)/25. Using the evidence of
the D, then, the odds in favour of the key B are

  (1/25) × 25 × 0.021/(1 - 0.021)
We may then consider the effect of the next letter in the column,
R, which gives a further factor of 25 × 0.064/(1 - 0.064). We are
here assuming that the evidence of the R is independent of
the evidence of the D. This is not quite correct, but is a
useful approximation: a more accurate method of calculation
will be given later. Let us write $P_\alpha$ for the frequency of the
letter $\alpha$ in plain language. Then our final estimate for
the odds in favour of key B is

$$\frac{1}{25} \prod_i \frac{25\,P_{\alpha_i - 1}}{1 - P_{\alpha_i - 1}}$$

where $\alpha_1, \alpha_2, \ldots$ is the series of letters in the 1st
column, and we use letters and numbers interchangeably, A
meaning 1, B meaning 2, ..., Z meaning 26 or 0. More generally
for key $\beta$ the odds are

$$\frac{1}{25} \prod_i \frac{25\,P_{\alpha_i - \beta + 1}}{1 - P_{\alpha_i - \beta + 1}}$$

One then decodes the column with the various possible keys, looks
up the decibanages of the factors $25P/(1 - P)$ for the decoded
letters, and adds them up.

The most convenient form for doing this is a table of values
of $20 \log_{10} \frac{25 P_\alpha}{1 - P_\alpha}$ taken to the nearest
integer, or as we may say, the values of the score in 'half
decibans'. One may also have columns showing multiples of these,
and the table made of double height (Fig 3). For the first column
with key B the decoded column is CQWSSOAVS and we score -5 for C,
-26 for Q, -5 for W, 17 for the three letters S, 5 for O, 7 for A
and -10 for V, totalling -17. The calculations can be done
very quickly by the use of the transparent gadget Fig 4, in
which squares are ringed in pencil to show the number of
letters occurring in the column. The gadget may be placed
over Fig 3 in various positions corresponding to the
various possible keys. The score is obtained by adding up the
numbers showing through the various squares. In Fig 5 the
alphabet has been written in a vertical column
below the cipher text of Fig 1, each letter representing a
possible key. The score for each key has been written opposite
the key, and under the relevant column. An X denotes a bad
score, not worth adding up. Usually these will be -15 or
worse. It will be seen that for the first column P, having
a score of 43, is extremely likely to be right, especially
as there is no other score better than 8. If we neglect this
latter fact the odds for the key are (1/25) × 10^(43/20), i.e.
about 5:1 on. The effect of decoding this column with key P
has been shown underneath. For the second column the best
key is O, but is by no means so certain as the first column.
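The scoring procedure just described can be sketched in a few lines. Everything here is an assumption made for illustration: the frequency table is a rough set of modern English letter frequencies and not the count of Fig 2, the first-column letters DRXTTPBWT are inferred from the decoded column and scores quoted in the text, and the function names are hypothetical. The scores therefore will not reproduce the figures of Fig 5 exactly.

```python
import math
import string

# Assumed approximate English letter frequencies in per cent (not Fig 2).
FREQ_PERCENT = dict(
    E=12.7, T=9.1, A=8.2, O=7.5, I=7.0, N=6.7, S=6.3, H=6.1, R=6.0,
    D=4.3, L=4.0, C=2.8, U=2.8, M=2.4, W=2.4, F=2.2, G=2.0, Y=2.0,
    P=1.9, B=1.5, V=1.0, K=0.8, J=0.15, X=0.15, Q=0.1, Z=0.07,
)
TOTAL = sum(FREQ_PERCENT.values())
P = {letter: value / TOTAL for letter, value in FREQ_PERCENT.items()}

def half_decibans(letter):
    """Score 20 log10(25 p / (1 - p)) rounded to the nearest integer."""
    p = P[letter]
    return round(20 * math.log10(25 * p / (1 - p)))

def decode(cipher_letter, key_letter):
    """Key A leaves a letter unchanged; key B turns D into C, and so on."""
    shift = ord(key_letter) - ord("A")
    return chr((ord(cipher_letter) - ord("A") - shift) % 26 + ord("A"))

def score_column(column, key_letter):
    """Add up the half-deciban scores of the decoded letters."""
    return sum(half_decibans(decode(c, key_letter)) for c in column)

COLUMN = "DRXTTPBWT"  # assumed first column, inferred from the text
scores = {key: score_column(COLUMN, key) for key in string.ascii_uppercase}
for key in sorted(scores, key=scores.get, reverse=True)[:3]:
    print(key, scores[key])
```

The dictionary `scores` plays the part of one column of Fig 5: a score for every possible key, from which the few good candidates can be read off.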

The decode for this column is also shown, and provides very
satisfactory combinations with the first column, confirming
both the keys. (This confirmation could also be based on
probability theory, given a table of bigramme frequencies).

In the third column I and C are best although D would be
very possible, and in the fourth column l and U are best.
Writing down the possible decodes we see that the first line
must read OWING and this makes the other lines read CONDI, ITHAS,
EL IPO, ETOIM, ALCUL, MA C i, HISIS, EGHET. By filling in the word
'conditions' the whole can now be decoded.

A more accurate argument would run as follows. For the
first column, instead of setting up as rival theories the
two possibilities that B is the key and that B is not the
key we can set up 26 rival theories that the key is A or B
or ... or Z, and we may apply the factor principle in the form:

  (A posteriori probability of key A)
    / (A priori probability of key A × Probability of getting
       the given column with key A)

= (A posteriori probability of key B)
    / (A priori probability of key B × Probability of getting
       the given column with key B)

= etc.

The argument to justify this form of factor principle is really
the same as for the original form. Let $p_\beta$ be the a priori
probability of key $\beta$. Then out of N cases we have $N p_\beta$
cases of key $\beta$. Let $q_\beta$ be the probability of
getting the given column C with key $\beta$. Then out of N cases
we have $N p_\beta q_\beta$ cases of key $\beta$ together with
the column C. When we have rejected the cases where we get
columns other than C we are left with cases of the various keys
in the ratio $p_\beta q_\beta$, which is the principle as stated.

We have therefore to calculate the probability of getting
the column C with key $\beta$, and this is simply
$\prod_i P_{\alpha_i - \beta + 1}$, i.e. the product of the
frequencies of the decode letters which we get if the key is
$\beta$. Since the a priori probabilities of the keys are all equal
we may say that the a posteriori probabilities are in the
ratio $\prod_i P_{\alpha_i - \beta + 1}$, i.e. in the ratio
$\prod_i 26 P_{\alpha_i - \beta + 1}$, which is more convenient for
calculation. The final value for the probability is then

$$\frac{\prod_i 26 P_{\alpha_i - \beta + 1}}{\sum_\gamma \prod_i 26 P_{\alpha_i - \gamma + 1}}$$

The calculation of the products may be done by the method
recommended before for $\prod_i 25 P/(1 - P)$. (The table in Fig 3
was in fact made up for $20 \log_{10} 26 P_\alpha$. The differences
between the two tables would of course be rather slight). The
new result is more accurate than the old because of the
independence assumption in the original result.

If we only want to know the ratios of the probabilities
of the various keys there is no need to calculate the
denominator. This denominator has however another importance:
it gives us some evidence about our other assumptions, such as
that the cipher is Vigenère, and that the period is 10. This
aspect will be dealt with later (p. ).
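A minimal sketch of the 26-theory form of the calculation follows. As before, the frequency table is an assumed rough modern English one rather than Fig 2, the column DRXTTPBWT is inferred from the text, and the function names are hypothetical. The a priori probabilities are taken as equal, so each key is simply weighted by the product of 26P over its decoded letters and the weights are divided by their total.

```python
import string

# Assumed approximate English letter frequencies in per cent (not Fig 2).
FREQ_PERCENT = dict(
    E=12.7, T=9.1, A=8.2, O=7.5, I=7.0, N=6.7, S=6.3, H=6.1, R=6.0,
    D=4.3, L=4.0, C=2.8, U=2.8, M=2.4, W=2.4, F=2.2, G=2.0, Y=2.0,
    P=1.9, B=1.5, V=1.0, K=0.8, J=0.15, X=0.15, Q=0.1, Z=0.07,
)
TOTAL = sum(FREQ_PERCENT.values())
P = {letter: value / TOTAL for letter, value in FREQ_PERCENT.items()}

def decode(cipher_letter, key_letter):
    """Key A leaves a letter unchanged; key B turns D into C, and so on."""
    shift = ord(key_letter) - ord("A")
    return chr((ord(cipher_letter) - ord("A") - shift) % 26 + ord("A"))

def key_posteriors(column):
    """A posteriori probability of each key, all keys equally likely a priori."""
    weights = {}
    for key in string.ascii_uppercase:
        weight = 1.0
        for c in column:
            weight *= 26 * P[decode(c, key)]   # factor 26 P for each letter
        weights[key] = weight
    denominator = sum(weights.values())
    return {key: w / denominator for key, w in weights.items()}

posteriors = key_posteriors("DRXTTPBWT")  # assumed first column
for key in sorted(posteriors, key=posteriors.get, reverse=True)[:3]:
    print(key, round(posteriors[key], 4))
```

The quantity `denominator` is the one the text singles out: when it is unusually small for every column, that is some evidence against the assumed cipher type or period.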


A letter subtractor problem


A substitution with period 91 × 95 × 99 is obtained by
superimposing three substitutions of periods 91, 95, and 99,
each substitution being a Vigenère composed of slides of
0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. The three substitutions are known
in detail, but we do not know for any given message at what
point in the complete substitution to begin. For many
messages however we can provide a more or less probable crib.
How can we test the probability of a crib before attempting
to solve it? It may be assumed that approximately equal numbers
of slides 0, 1, ..., 9 occur in each substitution.

The principle of the calculation is that owing to the
way in which the substitution is built up, not all slides
are equally frequent, e.g. a slide of 25 can only be
the sum of slides of 9, 8 and 8 or of 9, 9 and 7 whilst a slide
of 15 can be any of the following

    9,6,0    8,7,0    7,7,1    6,6,3
    9,5,1    8,6,1    7,6,2    6,5,4
    9,4,2    8,5,2    7,5,3
    9,3,3    8,4,3    7,4,4

A crib will therefore, other things being equal, be more likely
if it requires a slide of 15 than if it requires a slide of 25.
The problem is to make the best use of this principle, by
determining the probability of the crib with reasonable
accuracy, but without spending long over it.

We have to find out the probability of getting a given
slide. To do this we can apply several methods.

(a) We can produce a long stretch of key by addition and
take a count of the resulting slides. This is obviously a
very general method, and requires no special mathematical
technique. It may be rather laborious, but by interpreting
a small count with common sense one can probably get quite
good results.

(b) There are 1000 possible combinations of slides all
equally likely, viz 000, 001, ..., 999. We add up the
digits in these and take the remainder on division by
26, and then count the number of combinations giving each
of the possible remainders.

(c) We can make use of a trick which might appear to be
rather special, but is really applicable to a multitude of
problems. Consider the expression

$$f(x) = (1 + x + x^2 + \cdots + x^9)^3$$

For each possible way of expressing a number k as the
sum of three numbers 0, ..., 9, say $k = m_1 + m_2 + m_3$, there
is a term $x^{m_1} x^{m_2} x^{m_3}$ in $f(x)$, $x^{m_1}$ coming out of
the first factor, $x^{m_2}$ out of the second, and $x^{m_3}$ out of
the third. Hence the number of ways of expressing k in
the form $m_1 + m_2 + m_3$ is the coefficient of $x^k$ in $f(x)$,
i.e. in

$$\left(\frac{1 - x^{10}}{1 - x}\right)^3$$

or in

$$(1 - 3x^{10} + 3x^{20} - x^{30})(1 - x)^{-3}$$

Expanding $(1 - x)^{-3}$ by the binomial theorem

$$(1 - x)^{-3} = 1 + 3x + 6x^2 + 10x^3 + 15x^4 + 21x^5 + 28x^6 + 36x^7 + 45x^8 + 55x^9 + 66x^{10} + 78x^{11} + 91x^{12} + 105x^{13} + 120x^{14} + 136x^{15} + 153x^{16} + 171x^{17} + 190x^{18} + 210x^{19} + 231x^{20} + 253x^{21} + 276x^{22} + 300x^{23} + 325x^{24} + 351x^{25} + 378x^{26} + 406x^{27} + \cdots$$

Now multiply by $1 - 3x^{10} + 3x^{20} - x^{30}$ and we get

$$f(x) = 1 + 3x + 6x^2 + 10x^3 + 15x^4 + 21x^5 + 28x^6 + 36x^7 + 45x^8 + 55x^9 + 63x^{10} + 69x^{11} + 73x^{12} + 75x^{13} + 75x^{14} + 73x^{15} + 69x^{16} + 63x^{17} + 55x^{18} + 45x^{19} + 36x^{20} + 28x^{21} + 21x^{22} + 15x^{23} + 10x^{24} + 6x^{25} + 3x^{26} + x^{27}$$

This means to say that the chances of getting totals of 0, 1, 2, ...
are in the ratio 1, 3, 6, 10, ... The chances of getting
remainders of 0, 1, 2, ... on division by 26 are in the ratio
4, 4, 6, 10, 15, ... To get true probabilities these must be
divided by their total which is conveniently 1000.

(d) There are two other methods, both connected with the last
method but not relying so much on the special features of
the problem. They will be discussed later.
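Methods (b) and (c) above can be checked against one another with a short sketch. The names are hypothetical; method (b) is done by direct enumeration of the 1000 combinations, and method (c) by multiplying out the polynomial coefficient by coefficient rather than by the binomial expansion.

```python
from collections import Counter
from itertools import product

# Method (b): enumerate the 1000 equally likely triples of slides 0..9.
totals = Counter(a + b + c for a, b, c in product(range(10), repeat=3))
by_enumeration = [totals[k] for k in range(28)]

# Method (c): coefficients of (1 + x + ... + x^9)^3.
def poly_mul(p, q):
    """Multiply two polynomials given as lists of coefficients."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

one_factor = [1] * 10
by_polynomial = poly_mul(poly_mul(one_factor, one_factor), one_factor)

# Reduce the totals 0..27 to remainders on division by 26.
remainder_counts = [0] * 26
for total, count in enumerate(by_polynomial):
    remainder_counts[total % 26] += count

# Dividing by 1000 gives the probability of each slide.
slide_probability = [count / 1000 for count in remainder_counts]
```

Only the totals 26 and 27 wrap round, which is why the remainders 0 and 1 both come out at 4 while every other remainder keeps its original coefficient.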

Suppose then that the probabilities have been calculated
by one method or the other (as in fact we
have done under (c)). We can then estimate the values of cribs.
Let us suppose that a possible crib for a message beginning
ra^SXOWBVMMK was AMBASSADOR so that the slides
were 12, 9, 6, 22, 2, 0, 23, 11, 14. The slide of 12 gives
us some slight evidence in favour of the crib being right, for
slides of 12 occur with frequency 0.073 with right cribs,
whilst with wrong cribs they occur with frequency only 1/26.
The factor in favour of the crib is therefore 26 × 0.073
or about 1.9. A similar calculation may be made for each
of the slides, but of course the work may be greatly speeded
up by having the values of the factors $26 C_s/1000$ in half
decibans tabulated: here $C_s$ is the coefficient of $x^s$ in the
above polynomial $f(x)$. The table is given below (Fig 6)


    Slide   Slide   Score
      1       0      -20
      2      25      -16
      3      24      -12
      4      23       -8
      5      22       -6
      6      21       -3
      7      20       -1
      8      19        1
      9      18        3
     10      17        4
     11      16        5
     12      15        6
     13      14        6

Fig 6. Scores in half decibans of the various slides.


Evaluating this crib by means of this table we score

  6 + 3 - 3 - 6 - 16 - 20 - 8 + 5 + 6   (= -33)

i.e. the crib is worse by a factor of 10^(-33/20) than it was
before, e.g. if the a priori odds of the crib were 2:1
against it becomes 98:1 against. This crib was in fact made up
at random, i.e. the letters of the cipher text were
chosen at random. Now let us take one made up correctly, i.e.
really enciphered by the method in question, but with a
randomly chosen key.

NYXLNXIQHH
AMBASSADOR
13 12 22 11 21 5 8 13 19 16   (slides)

This scores 15 so that if it were originally 2:1 against it
becomes nearly 3:1 on.
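The scoring of the second crib can be reproduced with a short sketch. The scores are copied from Fig 6 as printed; the function names are hypothetical, and the slide is taken to be the cipher letter minus the crib letter modulo 26, which is the convention that reproduces the slides listed under AMBASSADOR.

```python
# Scores in half decibans copied from Fig 6. The slides s and 27 - s
# (taken modulo 26) stand in the same row and share a score.
FIG6 = {1: -20, 2: -16, 3: -12, 4: -8, 5: -6, 6: -3, 7: -1,
        8: 1, 9: 3, 10: 4, 11: 5, 12: 6, 13: 6}
SCORE = {}
for slide, value in FIG6.items():
    SCORE[slide] = value
    SCORE[(27 - slide) % 26] = value

def slides(cipher, crib):
    """Slide for each position: cipher letter minus crib letter, mod 26."""
    return [(ord(c) - ord(p)) % 26 for c, p in zip(cipher, crib)]

def crib_score(cipher, crib):
    """Total score of the crib in half decibans."""
    return sum(SCORE[s] for s in slides(cipher, crib))

def updated_odds(prior_odds, half_decibans):
    """Multiply the a priori odds by the factor 10^(score/20)."""
    return prior_odds * 10 ** (half_decibans / 20)

crib_slides = slides("NYXLNXIQHH", "AMBASSADOR")
score = crib_score("NYXLNXIQHH", "AMBASSADOR")
odds = updated_odds(1 / 2, score)   # a priori 2:1 against
print(crib_slides, score, round(odds, 2))
```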


Having decided on a crib the natural way to test it is to
have a catalogue of the positions in which a given series
of slides is obtained if the 91 period component is omitted.
We make 91 different hypotheses as to this third component,
and for each draw an inference as to what
is the part of the slide arising from the components of
periods 95 and 99 combined. This we look up in the catalogue.
This process is fairly lengthy, and as the scoring of the
crib takes only a minute it is certainly worth doing.



Theory of repeats

Suppose we have a cipher in which there are several very
long series of substitutions which can be used for enciphering
a message, but that one may sometimes get two messages
enciphered with the same series of substitutions (or
possibly, the series of substitutions for one message being
those for another with some at the beginning omitted). In
such a case let us say that the messages 'fit', or that they
fit at such and such a distance, the distance being the number
of substitutions which have to be omitted from the one series to
obtain the other series. One will frequently want to know
whether two messages fit or not, and one may find some evidence
about this by examining the repeats between them. By the repeats
between them I mean this. One writes out the cipher texts of
the two messages with the letters which are thought to have been
enciphered with the same substitution under one another. One then
writes under these messages a series of letters o and x, an o
being written where the cipher texts differ and an x where they
agree. The series of letters o and x will begin where the second
message begins and end where the first to end ends. This series
of letters o and x may be called the repetition figure. It may be
completed by adding at the ends an indication of how many letters
there are which do not overlap, and which message they belong to.

As an example 

0-IilLIKvoG-VBI'.IILAi , lXivlvlOROGBYSKYXDAZCHI^IUlvIRK3ZLDLDOHGi,iVTIPRoD 

VLOVDY' ■CEJSOPYGBi ■LBiCCCDAZNBI’IOPTT’CXDOD 

8 - 11 
XOOOOOOOOOOXOOXXOOXXXXXXOOOOOOOOOOXOX- 1 -- 1 - 
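The construction of a repetition figure can be sketched as follows. The function name and the `offset` parameter are hypothetical, the offset is assumed not to be negative, and the two strings are illustrative texts rather than the cipher texts of the example above.

```python
def repetition_figure(first, second, offset=0):
    """Repetition figure of two texts.

    `second` is taken to begin `offset` letters after the start of
    `first` (offset >= 0). An x marks agreement and an o disagreement;
    the figure ends where the first message to end ends.
    """
    overlap = zip(first[offset:], second)
    return "".join("x" if a == b else "o" for a, b in overlap)

figure = repetition_figure("THEMAINCONVOYWILLARRIVE",
                           "ALLCONVOYSMUSTREPORT", offset=4)
repeats = figure.count("x")
print(figure, repeats, len(figure))
```

The number of letters x and the length of the figure (the overlap) are the two quantities on which the simplest form of the theory depends.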





On the whole one expects that a fit is more likely to
be right the more letters x there are in the repetition
figure, and that long series of letters x are especially
desirable. This is because it would not be very
unusual for two fairly common words to lie directly under
one another when the clear texts are written out, thus

THEMAINCONVOYWILLARRIVE...
    ALLCONVOYSMUSTREPORT...
    xooxxxxxxoooooxoooo...

If the corresponding cipher texts really fit, i.e. if the
letters in the same column are enciphered with the same
substitution, then the condition for an x in the repetition
figure of the cipher texts is that there be an x in the
repetition figure of the corresponding clear text. Now series
of several consecutive letters x can occur quite easily as above
by two identical words coming under one another, or by
such combinations as

ITISEASIERTOTEACHTHANALGEBRA...
   THERAINWASSUCHTHATHECOULD...
   ooooooooooooxxxxxoooooooo...

if the messages really fit, but if not they can only occur
by complete coincidence. One therefore tends to believe that
there is a fit when one gets such series of letters x. As
regards single cases of x the value of them is not so clear, but
one can see that if $P_\alpha$ is the frequency of letter $\alpha$
in plain language then the frequency of letters x as a whole in
comparisons of plain language with plain language is
$\sum_\alpha P_\alpha^2$, whilst for wrong fits
of cipher text it is 1/26 which is necessarily less. Given
a sufficiently long repetition figure one should therefore be
able to tell whether it is a fit or not simply by counting the
letters x and o.

So much is well known. The real point of this section is
to show how these ideas can be developed into an accurate
method of estimating the probabilities of fits.

Simple form of theory. The complete theory takes account of
the various possible lengths of repeat. As this theory is
somewhat complicated it will be as well to give first two
simplified forms of the theory. In both cases the
simplification arises by neglecting a part of the evidence.

In the first simplified form of theory we neglect all
evidence except the number of letters x and the number of
letters o. In the other simplified form the evidence is the
number of series of (say) four consecutive letters x in a
repetition figure.

When our evidence is just the number of times x occurs
in the repetition figure, and the length of the repetition
figure (N say), then the factor in favour of the fit is

  (Probability of a right repetition figure of length N
   having n occurrences of x)
  / (Probability of a wrong repetition figure of length N
     having n occurrences of x)

As an approximation we may assume that the
numerator of this expression has the same value as if the
right repetition figures were produced letter
by letter by N independent random choices, with a certain
fixed probability of getting an x at each stage. This
probability will have to be the frequency of letters x in
right repetition figures. The numerator is then

  (Number of repetition patterns with length N and n occurrences of x)
  times (Probability of getting a given such repetition pattern
  by the process just mentioned)

which we may write as R(N,n) Q(N,n). Now let us denote by $\beta$ the
frequency of letters x in right repetition figures. Then
$Q(N,n) = \beta^n (1 - \beta)^{N-n}$. The same argument applies for
the denominator, but here we must take 1/26 in place of $\beta$,
since all letters occur equally frequently in the cipher. The
denominator is then $R(N,n)\,(1/26)^n (25/26)^{N-n}$. In dividing to
find the factor for the fit R(N,n) cancels out, leaving

$$(26\beta)^n \left(\frac{26(1 - \beta)}{25}\right)^{N-n}$$

i.e. we score $10 \log_{10} \frac{25\beta}{1 - \beta}$ decibans per
repeat and $10 \log_{10} \frac{26(1 - \beta)}{25}$
decibans per unit length of repetition figure ('per unit overlap').

An alternative argument, leading to the same result, runs
as follows. Having decided to neglect all evidence except the
overlap and the number of repeats we pretend that nothing else
matters, i.e. that the form of the figure is irrelevant. In
this case we can regard each letter of the repetition figure
as independent evidence about the fit. If we get an x the
factor for the fit is

  (Probability of getting an x if the fit is right)
  / (Probability of getting an x if the fit is wrong)

i.e. $26\beta$. Similarly the factor for an o is $26(1 - \beta)/25$.
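The two ways of writing the score can be compared in a short sketch. The function names are hypothetical, and the value 0.067 used for the frequency of repeats in plain language is an assumed illustrative figure, not one taken from the text; in practice it would be found from a count on the language concerned.

```python
import math

def fit_factor(n_repeats, overlap, beta, alphabet=26):
    """Factor for a fit from the count of x and the overlap alone."""
    chance = 1 / alphabet
    per_x = beta / chance               # factor 26 beta for each x
    per_o = (1 - beta) / (1 - chance)   # factor 26 (1 - beta) / 25 for each o
    return per_x ** n_repeats * per_o ** (overlap - n_repeats)

def fit_decibans(n_repeats, overlap, beta, alphabet=26):
    """The same score expressed per repeat and per unit overlap."""
    per_repeat = 10 * math.log10((alphabet - 1) * beta / (1 - beta))
    per_overlap = 10 * math.log10(alphabet * (1 - beta) / (alphabet - 1))
    return n_repeats * per_repeat + overlap * per_overlap

BETA = 0.067  # assumed frequency of repeats in plain language

# Example: 8 repeats in an overlap of 37 letters.
factor = fit_factor(8, 37, BETA)
score = fit_decibans(8, 37, BETA)
print(round(factor, 2), round(score, 2))
```

Each repeat scores a positive amount and each unit of overlap a small negative one, so a long overlap with only the chance number of repeats counts against the fit.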



In either form of argument it is unnecessary to calculate
the number R(N,n). In this particular case there is no
particular difficulty about it: it is the binomial coefficient
N!/(n!(N-n)!). In some similar problems this cancelling out is a
great boon, as we might not be able to find any simple form for
the factor which cancels. The cancelling out is a normal feature
of this kind of problem, and it seems quite natural that it
should happen when we think of the second form of argument in
which we think of the evidence as consisting of a number of
independent parts.

The device of assuming, as we have done here, that the
evidence which is not available is irrelevant can often
be used and usually leads to good results. It is of course
not supposed that the evidence really is irrelevant, but only
that the error resulting from this assumption
when used in this kind of way is likely to be small.

In th second fxrx simplified form of theory we take* a^ our 
ev'denc^that a particular part of the repetition figure is 
Oxxxxo (say, or alternatively oxxxxxo sajr) . The factor is 
t' en 

Frequency of oxxxxo ir right repetit i on f igum g 

F equency ’of o'-mm^o in •’Tong repetition figures 

denominator ) V 

estimated by tekin-s a sc mole of l c nguage hexagrams and counting 
the number of pairs that have the renet it ion figure oxxxxo. • 

The expectation of the number of such pairs is the sum for
all pairs of the probabilities of those pairs having
the desired repetition figure, i.e. is the number of such pairs
(viz. N(N-1)/2 where N is the size of the sample) multiplied
by the frequency of oxxxxo repetition figures. This frequency
may therefore be obtained by division if we equate the
expected number of these repetition figures to the actual
number.
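
The counting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the tiny sample of hexagrams are hypothetical, and a real estimate would need a large sample of language hexagrams.

```python
from itertools import combinations

def repetition_figure(u: str, v: str) -> str:
    """'x' where two equal-length strings agree, 'o' where they differ."""
    return "".join("x" if a == b else "o" for a, b in zip(u, v))

def figure_frequency(sample: list[str], figure: str = "oxxxxo") -> float:
    """Equate expected and actual counts: frequency = hits / (N(N-1)/2)."""
    n = len(sample)
    pairs = n * (n - 1) // 2  # number of pairs in a sample of size N
    hits = sum(1 for u, v in combinations(sample, 2)
               if repetition_figure(u, v) == figure)
    return hits / pairs

# Hypothetical toy sample, chosen only to exercise the function.
sample = ["ABCDEF", "XBCDEY", "QBCDEZ", "ABCDEF"]
frequency = figure_frequency(sample)
```

The same function serves for oxxxxxo or any other figure by changing the `figure` argument and the length of the sampled pieces.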


statistics of every conceivable repetition figure. We must make
some assumption to reduce the variety that need be considered.
The following assumption is theoretically very convenient, and
also appears to be a very good approximation.

The probabilities of repeats at two points known to be
separated by a point where there is known to be no repeat are
independent.

We may also assume that the probability of a repeat is
independent of anything but the repetition figure in its
neighbourhood. (We may however as a refinement produce
different statistics for different types of messages, and
different positions in a message). We can therefore
think of a repetition figure as being produced by selecting
the symbols of the figure consecutively, the
probability of getting an x at each stage being determined by
the repetition figure from the point in question back as far as
the last o. Sometimes this will take us back as far as the beginning
of the message, and will include the number telling us how
many more letters there are which do not repeat at all. We need
in practice only distinguish two cases, where this number is 0
and when it is more. We therefore have to distinguish the
following cases



no repeat    o       a_0      some        b_0      none        c_0

             ox      a_1      some x      b_1      none x      c_1

             oxx     a_2      some xx     b_2      none xx     c_2

             oxxx    a_3      some xxx    b_3      none xxx    c_3

             ...              ...                  ...


The entries $a_r$, $b_r$, $c_r$ opposite the repetition figures
are the notations we are adopting for the probability of
getting another x following such a figure. Strictly speaking we
should also bring in a notation for the probability of the
message coming to an end after any given repetition figure.

As the repeats at the end of a comparison do not appear to
behave very differently from those in the main part of the
message I shall neglect this complication by assuming that the
probability of getting an o added to the probability of getting
an x is 1, and that afterwards one cuts off the end of the
series arbitrarily.
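
A minimal sketch of how a factor could be computed under this model follows. It rests on assumptions not stated in the passage: that in a wrong repetition figure an x occurs with probability 1/26 independently at each place, that runs longer than the supplied table reuse its last entry, and that all the numerical values are hypothetical. The names `figure_factor`, `after_o` and `at_start` are illustrative only.

```python
def figure_factor(figure, after_o, at_start, alphabet=26):
    """Product of P(symbol | right) / P(symbol | wrong) over a figure of x's and o's.

    after_o[r]  : probability of a further x after an o followed by r x's
    at_start[r] : the same when the r x's reach back to the start of the overlap
                  (the 'some' or 'none' case, whichever applies)
    """
    p_wrong_x = 1 / alphabet          # assumed chance of a repeat in a wrong fit
    table, run = at_start, 0
    factor = 1.0
    for symbol in figure:
        p_x = table[min(run, len(table) - 1)]   # long runs reuse the last entry
        if symbol == "x":
            factor *= p_x / p_wrong_x
            run += 1
        else:
            factor *= (1 - p_x) / (1 - p_wrong_x)
            table, run = after_o, 0             # every later run follows an o
    return factor

# Hypothetical statistics, for illustration only.
a = [0.07, 0.30, 0.45, 0.55, 0.60]
c = [0.10, 0.35, 0.50, 0.60, 0.65]
example = figure_factor("xxxxoooxoxxxo", after_o=a, at_start=c)
```

If every entry of both tables is set to 1/26 the right and wrong models coincide and the factor is 1, which is a convenient check on the bookkeeping.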


Let us calculate the factor for the repeat figure 


    none  x  x  x  x  o  o  o  x  o  x  x  x  o

C o 

c » 


s 

“S 


l '*o 


i-a, 


a ( 

a z- 

'- a 2 

1 


i. 

f 


%<c 

vf 

, 


1 

\ 


2f 


^ io 

2-6 

X6 

v* 


t2 

u. 


V(> 

VL 

u 

Zt 


Underneath each symbol has been written the probability that
one would get that symbol, knowing the ones which precede, both
for the case of a right and of a wrong repetition figure. The
factor for the fit is the product of the first row divided
by the product of the second. It is convenient to split
this up, as indicated by the vertical lines, into the product of




C 



'LL 


I- 

Izl 

1 -L 







«T» « / 

TlF 

and this product may be put into 

C » ‘‘ *>- C .l Q~ c i.) f \'^ 

IM H '*1a l ^ 7 -l ' 

“• l -*,) ., <-«* y*~ 

( *yu I 

yt. 



the form of the product of

which we call the factor for
an initial tetragramme repeat, level

the factor for a single repeat

the factor for a trigramme

the correction for a final bigramme

the factor for an overlap of 16.


CL, C l‘* % ) 




la 




I L 


ft* J-i Ju qk- • . 


A 


We shall neglect the correction for a final bigramme (or whatever
it may be). It is in any case rather small, and vanishes if
the repetition figure ends with o; also with our conventions
the whole question of the ends of repetition figures has been
left rather in doubt.

Now let us put 


a. a , 

0 f 


1 I ~ ^ V * I ) 


« 


PM \ 




4 r 


C r C « - 

The values of the $a_r$ can be obtained as follows. We take a
number of plain language messages and leave out two or three
words at the beginning. Then combine the messages to form one
long message: this message may be made to 'eat its own tail', i.e.
it may be written round a circle. If the message were compared
with itself in every possible position, except level, we should
expect to get repetition figures which, when divided by
vertical lines after each o, contain $N_r$ parts which consist
of $r$ symbols x followed by an o, or as we may say $N_r$ actual
$r$-gramme repeats. $N_r$ can be calculated given the apparent
number of $r$-gramme repeats, $M_r$, for each $r$. This apparent
number of $r$-gramme repeats is the number of series of $r$
consecutive symbols x in the repetition figures
regardless of what precedes or follows the series. By considering
the ways in which an actual repeat can give rise to apparent
repeats of various lengths we see that

$$M_r = N_r + 2N_{r+1} + 3N_{r+2} + \dots$$

and therefore

$$M_r - M_{r+1} = N_r + N_{r+1} + N_{r+2} + \dots$$

$$(M_r - M_{r+1}) - (M_{r+1} - M_{r+2}) = N_r$$


and 



4 

The calculation of $b_r$ may perhaps best be done by
comparing the beginners of a number of messages with
the long circular message, and the values of $c_r$ by
comparing the beginners among themselves. A similar
technique of actual and apparent numbers of repeats
can be used. I shall not go into this in detail.
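
The passage from apparent to actual numbers of repeats is a second difference, and may be sketched as below. The function names and the counts are hypothetical. The last function is an assumption on my part rather than a formula taken from the text: it estimates the probability of a further x after an o followed by r x's as the proportion of parts with at least r x's that go on to have more.

```python
def apparent_from_actual(actual):
    """An actual s-gramme repeat contains (s - r + 1) apparent r-gramme repeats."""
    return [sum((s - r + 1) * actual[s] for s in range(r, len(actual)))
            for r in range(len(actual))]

def actual_from_apparent(apparent):
    """Second difference: actual[r] = M[r] - 2*M[r+1] + M[r+2], zero past the end."""
    m = list(apparent) + [0, 0]
    return [m[r] - 2 * m[r + 1] + m[r + 2] for r in range(len(apparent))]

def continuation_probabilities(actual):
    """Assumed estimate of a_r: parts with more than r x's over parts with at least r."""
    probs = []
    for r in range(len(actual)):
        at_least = sum(actual[r:])
        probs.append(sum(actual[r + 1:]) / at_least if at_least else 0.0)
    return probs

# Hypothetical counts of parts with 0, 1, 2, 3 symbols x before the closing o.
actual_counts = [100, 20, 5, 1]
apparent_counts = apparent_from_actual(actual_counts)
recovered = actual_from_apparent(apparent_counts)
```

Counting apparent repeats is the easier operation in practice, since one need not look at what precedes or follows each run; the differencing then recovers the actual numbers.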

The formulae required may now be assembled.



decibanage for an r-gramme repeat
negative decibanage for unit overlap
number of occurrences in the statistics of
the r-gramme



SJ 2 . total number of letters in the statistics 
Transposition ciphers

In making calculations about substitution ciphers
we have often found it useful to treat the plain
language as if it were produced by independent choices
for the letters, using certain fixed frequencies with
which the letters are chosen. Our method for Vigenère
and one of the simplified forms of repeat theory could
be based on this sort of assumption. With a transposition
cipher however such an assumption would be useless or
worse than useless, for it would result in the
conclusion that all transpositions were equally likely.

We have therefore to make a slightly less crude
assumption, and the one which suggests itself is that
the letters forming the plain language are chosen
consecutively, the probability of getting a particular
letter depending only on what the letter is and what
the preceding letter was. It is easily verified that
if $P_{\alpha\beta}$ is the proportion of bigrammes $\alpha\beta$ in plain
language and $P_\alpha$ the frequency of the letter $\alpha$ then
the probability of a letter $\beta$ following an $\alpha$ is $P_{\alpha\beta}/P_\alpha$.
The probability of a piece of plain language of length $L$
letters saying $\alpha_1\alpha_2\dots\alpha_L$ is then

$$P_{\alpha_1}\,\frac{P_{\alpha_1\alpha_2}}{P_{\alpha_1}}\,\frac{P_{\alpha_2\alpha_3}}{P_{\alpha_2}}\cdots\frac{P_{\alpha_{L-1}\alpha_L}}{P_{\alpha_{L-1}}}$$

which may also be written

$$\frac{\prod_{i=1}^{L-1}P_{\alpha_i\alpha_{i+1}}}{\prod_{i=2}^{L-1}P_{\alpha_i}}$$

We may also calculate the probability of a piece of plain language
having certain given letters in given places, the remainder
of the message being unspecified. The probability is given



2Yf, — S'-*' ) * * * > z u } 


and if the data is that the known letters are


k, d*h 



4 * 


a 


\r - / 


n r <Uh 



it is approximately 



C« 
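
The assumption that each letter depends only on its predecessor can be sketched as follows for a completely specified piece of text. The two-letter alphabet and all frequencies are hypothetical, and the names `p_letter` and `p_bigram` are illustrative; logarithms are used only to avoid very small products.

```python
import math

def log_probability(text, p_letter, p_bigram):
    """log of P(first letter) times the product of P(pair) / P(first of pair)."""
    logp = math.log(p_letter[text[0]])
    for a, b in zip(text, text[1:]):
        logp += math.log(p_bigram[a + b]) - math.log(p_letter[a])
    return logp

# Hypothetical statistics for a two-letter language; each letter frequency
# equals the sum of the bigram proportions beginning with that letter.
p_bigram = {"AA": 0.1, "AB": 0.4, "BA": 0.4, "BB": 0.1}
p_letter = {"A": 0.5, "B": 0.5}

probability_aba = math.exp(log_probability("ABA", p_letter, p_bigram))
```

Summing the probabilities of all pieces of a given length gives 1, which is the sense in which the bigram proportions define a consistent way of producing text.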


A more or less rigorous deduction of this approximation
is given in a later
section. For the present let us see how it can be applied.

If we have two theories about the transposition
of which the one requires the above pattern of letters,
and the other brings the same letters into positions
in which no two of them are consecutive, then the factor
in favour of the first as compared with the second is


We can apply this straightforwardly to the case of simple
transposition by columns. The following text is known to
be a simple transposition of a certain type of German text with
a key length of not more than 15.
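
A sketch of how two stretches of a column transposition might be scored against one another is given below. It assumes that bringing a letter α immediately before a letter β scores the factor P_αβ / (P_α P_β), which is what the bigram model suggests when the alternative leaves the two letters far apart, and that scores are kept in half-decibans (20 log10 of the factor) rounded to the nearest unit. The statistics are hypothetical and the function name is illustrative.

```python
import math

def half_deciban_score(earlier, later, p_letter, p_bigram):
    """Sum of rounded half-deciban scores for placing `later` just after `earlier`."""
    total = 0
    for a, b in zip(earlier, later):
        factor = p_bigram[a + b] / (p_letter[a] * p_letter[b])
        total += round(20 * math.log10(factor))
    return total

# Hypothetical two-letter statistics, for illustration only.
p_bigram = {"AA": 0.1, "AB": 0.4, "BA": 0.4, "BB": 0.1}
p_letter = {"A": 0.5, "B": 0.5}

good = half_deciban_score("AB", "BA", p_letter, p_bigram)    # pairs AB, BA
mixed = half_deciban_score("AAB", "BAA", p_letter, p_bigram)  # pairs AB, AA, BA
```

In practice the rounded score for each bigramme would be looked up in a table prepared once for the type of traffic, so that scoring a comparison is only a matter of addition.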
To solve this transposition we may try comparing the
first six letters S A T P T W which we know form part
of one column with each other series of six letters
in the message, for we know that one such comparison will
give entirely bigrammes occurring in the decode. We may
try first

S F
A A
T S
P T
T A
W U

The factor for a transposition which brings these letters
together, as compared with one which leaves them apart is


up for the type of traffic in question, and given to the


ha If — dec iba ns ) w© get the product by puuiuiuu oucn a t/^dUb 


If we consider this combination as a priori
about 100:1 against (there are 95 letters in the message)
it is a posteriori about 3000:1 against. Similar scoring
may be done for every possible comparison of S A T P T W
with six consecutive letters of the message. The comparison
may be made both with S A T P T W as earlier and as later
column; one may also use the last six letters of the
message H U A N W R. The results of doing this are
shown in Fig 7. The message has been written out vertically.
The first column of figures after the message gives
the scores for S A T P T W as earlier column, entered
against the first letter of the later column, e.g. the
-36 as calculated above gets entered against the F of
F A S T A U. The second column after the message
consists of the scores for H U A N W R as first column
and the column before the message gives the scores for
H U A N W R as second column. One of these columns has

been worked out in detail but in the other two crosses
have been put in where the scores are very bad. The scores
which eventually turned out to be right are ringed. The
fourth comparison, which did not have to be done, scored
very badly viz. -27. Amongst the good scores which were
wrong there was one score of 37. It was not difficult to
see that this one was wrong as most of the score came from
WO which requires Z to precede it, and there was no Z in the
message. Apart from this fact the comparison was about evens,
although if we take into account the fact that there was no
better score it would be better. We have already had
a case of this kind of thing in connection with Vigenère;

if the various positions are a priori equally likely and 

the factors, ore then the value 

probability of the ' N 'dw-Aj 

for theJ y alternative is better than .. J. — 
A rigorous deduction of the formula (A). (This is something
of a digression).

The probability of a piece of plain language coinciding
where necessary with the data (D) is

where $Q$ is the matrix whose $\alpha\beta$ coefficient is $P_{\alpha\beta}/P_\alpha$.

The formula (A) would then be accurate if we could say that
for $n > 0$, $(Q^n)_{\alpha\beta} = P_\beta$. This is not
true, but it is true that except for very special values
for $P_{\alpha\beta}$, $(Q^n)_{\alpha\beta} \to P_\beta$ as $n \to \infty$, and
the convergence is rather rapid. To prove this I shall
assume that the eigenvalues of $Q$ are all different;
in this case we can find a matrix $U$ with unit determinant,
such that $U^{-1}QU$ is in diagonal form

O & * ~ 
since $QU = UD$ we have


f \ 5 f ^ « * 


i.e . 


f V >/» : //* 


that is, for each 


^4 j provides 8 solution of 


r 4. i ■ u. t-. 


o) 


with n. . Conversely if we have 8ny solution of C^O 

for some and all , for as (J 

is non singular we can ]au find numbers V such that 
£ s5k. c. for nil pC , and then substituting in hr) 

«■ r •‘t <r 

we set 


£ H V V "f£*- r " r 


i.e. 
L+0


>ut ing t- / for all yi se that one member of 

the series ^ is 1, for (E) is certainly 

satisfied. I shall prove that the re aaining eigenvalues 
s tisfy It' U i . e first srov° that if then 

^ • Thiq foll °' s hy multiplying (E) on e^oh 
for which //*/>' is real and positive. Let satisfy (E)

with My I i ‘then the eigenvalue for ^ Is j*. 

5 £»* - <t2t */*-<«. 

If ^>Ohas been chosen so small th^ 4 tciiexSxMxS ft <-4 > 
then the L.H.S. is positive for the coefficients in the 
matrix are positive, whereas the R.H.S. is ne -tive for 
suitably chosen V* , unless ^ O . if now jji. > / 
we may tale it that is a?e-l for each pC . As it 

must satisfy Z>-4 z. D it is ne^gtive for some pc. , but 


- M 
u 




then 


3 (A'h^y) ■ 
and if this $\epsilon$ is chosen so that the L.H.S. is
positive whereas the R.H.S. is negative for sufficiently
large $\nu$. All the eigenvalues therefore satisfy



as the eigenvalues ~-re -1 dif er^nt^/this means that 
±KXT 3 txtVTSl l f*l< 1 excep for one w*lue of ^ 

y- — 9 tfo ^ tends to &j rax x [g xxx e matrix 

which has only one element different from 0, and that
a 1 on the diagonal, say in position $\sigma\sigma$.
Calling this matrix $X$ the series of $Q^n$
tends to the limit $UXU^{-1}$. This matrix is
nd only one^ hich satisfies y <s f • y a and

therefor 'the one whose fit A coefficient is ^yj . 


>?% fi+o 
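
The rapidity of the convergence can be seen numerically. The sketch below assumes that the matrix in question has coefficients P_αβ / P_α, so that each row sums to 1, and uses a hypothetical table of bigramme proportions for a three-letter language in which the row sums and column sums agree. The function names are illustrative.

```python
def transition_matrix(bigram_table):
    """Divide each row of bigramme proportions by its sum (the letter frequency)."""
    return [[x / sum(row) for x in row] for row in bigram_table]

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(q, n):
    """q multiplied by itself n times in all (n >= 1)."""
    result = q
    for _ in range(n - 1):
        result = mat_mul(result, q)
    return result

# Hypothetical bigramme proportions; rows and columns both sum to 0.3, 0.4, 0.3.
P = [[0.10, 0.15, 0.05],
     [0.05, 0.20, 0.15],
     [0.15, 0.05, 0.10]]
letter_freq = [sum(row) for row in P]
Q = transition_matrix(P)
Q30 = mat_power(Q, 30)
max_error = max(abs(Q30[i][j] - letter_freq[j])
                for i in range(3) for j in range(3))
```

Every row of the thirtieth power agrees with the letter frequencies to within a very small error in this example, which is the behaviour the argument above establishes in general.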
There is another probability problem that arises in
connection with simple transposition. With a message
of length $L$, and a key length of $K$, what is the
probability that the $m$th letter will be at the bottom
of a column. Let $S$ be the length of the short columns
i.e. $S = [L/K]$; let $\sigma = L - SK$. Then if the
$m$th letter is at the bottom of the $\alpha$th column we
must have $\alpha S \le m \le \alpha(S+1)$, and there will be
$\alpha(S+1) - m$ short and $m - \alpha S$ long columns amongst
these first $\alpha$ columns. There are
$\binom{\alpha}{m-\alpha S}\binom{K-\alpha}{\sigma-m+\alpha S}$ ways in
which the short and long columns can be arranged consistently
with this, and altogether $\binom{K}{\sigma}$ ways in which the
columns can be arranged, so that the probability of
the $m$th letter being at the bottom of a column
is
There will normally be very few terms in the sum.
Let us take the case of a message of length 133 and
consider the 45th letter, assuming the key length is
bet een 10 • nd SO (inclusive). «->. L*i33 « - Us' 
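
The counting argument for the bottom of a column can be sketched as follows, assuming that every arrangement of long and short columns is equally likely and that the key is no longer than the message. The names are hypothetical, and the range of key lengths in the usage line is an assumption, since only the lower limit of 10 is legible above.

```python
from fractions import Fraction
from math import comb

def prob_bottom(length, key, m):
    """Probability that the m-th letter (counting from 1) ends a column."""
    short, n_long = divmod(length, key)   # short column length, number of long columns
    ways = 0
    for alpha in range(1, key + 1):
        long_first = m - alpha * short    # long columns among the first alpha
        long_rest = n_long - long_first   # long columns among the remainder
        if 0 <= long_first <= alpha and 0 <= long_rest <= key - alpha:
            ways += comb(alpha, long_first) * comb(key - alpha, long_rest)
    return Fraction(ways, comb(key, n_long))

# Message of length 133, 45th letter; the upper key length here is assumed.
table = {key: prob_bottom(133, key, 45) for key in range(10, 21)}
```

Two checks on the formula are that the last letter of the message is always at the bottom of a column, and that the probabilities summed over every position in the message give the number of columns.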
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